CL: p_attributes

cl_cpos2str

cl_cpos2id

cl_id2str

cl_regex2id

cl_str2id

cl_id2freq

cl_id2cpos

CWB indexed corpora store the text of a corpus as numbers: every token
in the token stream of the corpus is identified by a unique corpus
position, and the string value of every token is identified by a unique
integer id. The corpus library (CL) offers a set of functions to convert
between corpus positions, token ids, and the character strings of tokens.
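
The conversions described above can be sketched as a round trip. This is a minimal illustration, not a definitive recipe: it assumes that the REUTERS sample corpus shipped with 'RcppCWB' is installed and that the package's `get_tmp_registry()` helper returns a registry directory describing it. The corpus name, the p-attribute `"word"`, and the regular expression are illustrative choices.

```r
library(RcppCWB)

# Assumption: the REUTERS sample corpus bundled with RcppCWB is available and
# get_tmp_registry() points to a registry directory that describes it.
registry <- get_tmp_registry()

# Corpus positions are zero-based: take the first eleven tokens.
cpos <- 0:10

# corpus positions -> token ids -> token strings
ids <- cl_cpos2id(
  corpus = "REUTERS", p_attribute = "word", cpos = cpos, registry = registry
)
strs <- cl_id2str(
  corpus = "REUTERS", p_attribute = "word", id = ids, registry = registry
)

# The same strings, retrieved in one step.
strs_direct <- cl_cpos2str(
  corpus = "REUTERS", p_attribute = "word", cpos = cpos, registry = registry
)

# token strings -> token ids (the reverse direction)
ids_from_str <- cl_str2id(
  corpus = "REUTERS", p_attribute = "word", str = strs, registry = registry
)

# Corpus-wide frequency of each id.
freqs <- cl_id2freq(
  corpus = "REUTERS", p_attribute = "word", id = ids, registry = registry
)

# All corpus positions at which the first token's id occurs.
hits <- cl_id2cpos(
  corpus = "REUTERS", p_attribute = "word", id = ids[1], registry = registry
)

# Ids of all lexicon entries that match a regular expression.
m_ids <- cl_regex2id(
  corpus = "REUTERS", p_attribute = "word", regex = "M.*", registry = registry
)
```

Working with integer ids rather than strings keeps lookups cheap; strings are only materialised at the end, when results are shown to a reader.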

'Rcpp' Bindings for the C code of the 'Corpus Workbench' ('CWB'), an indexing and query
engine to efficiently analyze large corpora (<https://cwb.sourceforge.io>). 'RcppCWB' is licensed
under the GNU GPL-3, in line with the GPL-3 license of the 'CWB' (<https://www.r-project.org/Licenses/GPL-3>).
The 'CWB' relies on 'pcre2' (BSD license, see <https://github.com/PCRE2Project/pcre2/blob/master/LICENCE.md>)
and 'GLib' (LGPL license, see <https://www.gnu.org/licenses/lgpl-3.0.en.html>).
See the file LICENSE.note for further information. The package includes modified code of the
'rcqp' package (GPL-2, see <https://cran.r-project.org/package=rcqp>). The original work of the authors
of the 'rcqp' package is acknowledged with great respect, and they are listed as authors of this
package. 'RcppCWB' uses 'Rcpp' for its wrapper code to achieve cross-platform
portability (including Windows).

Andreas Blaette

RcppCWB

'Rcpp' Bindings for the 'Corpus Workbench' ('CWB')

Bernard Desgraupes

Sylvain Loiseau

Oliver Christ

Bruno Maximilian Schulze

Stephanie Evert

Arne Fitschen

Jeroen Ooms

Marius Bertram

Tomas Kalibera

Arguments

<dl>

<dt>corpus</dt>
<dd>name of a CWB corpus (upper case)</dd>


<dt>p_attribute</dt>
<dd>a p-attribute (positional attribute)</dd>


<dt>registry</dt>
<dd>path to the registry directory, defaults to the value of the
environment variable CORPUS_REGISTRY</dd>


<dt>cpos</dt>
<dd>corpus positions (integer vector)</dd>


<dt>id</dt>
<dd>id of a token</dd>


<dt>regex</dt>
<dd>a regular expression</dd>


<dt>str</dt>
<dd>a character string</dd>

</dl>

Using Positional Attributes.
